Optimal DALI protein structure alignment
Abstract
We present a mathematical model and an exact algorithm for protein structure alignment under DALI scoring, an NP-hard problem. DALI scoring compares the inter-residue distance matrices of two proteins and is the scoring model of the widely used heuristic program DALI. Our model and algorithm extend an integer linear programming approach previously used for the related contact map overlap problem. To this end, we introduce a novel type of constraint that handles negative structure scores and relax it in a Lagrangian fashion. We also review options for explicitly considering fewer pairs of inter-residue distances, because their large number makes it difficult to optimize DALI scoring exactly. We use our exact algorithm DALIX to compute many provably score-optimal DALI alignments for the first time, using four data sets of varying structural similarity. Further, our exact DALIX alignments make it possible, for the very first time, to qualitatively benchmark the heuristic DALI program in sound mathematical terms. The results indicate that DALI often computes optimal or close-to-optimal alignments, but also that when aligning small proteins it tends to fail to generate any significant alignment even though such an alignment exists.
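The abstract does not state the scoring function itself. As a rough sketch only, the following Python assumes the commonly cited form of the DALI score: for each pair of aligned residue pairs, a threshold minus the relative deviation of the two inter-residue distances, damped by a Gaussian envelope in the mean distance. The constants `THETA = 0.2` and `ALPHA = 20.0` (angstroms), the helper names, and the toy coordinates are assumptions for illustration and are not taken from this abstract. The sketch shows why individual terms, and whole alignments, can score negatively, which is the situation the new constraint type is meant to handle.

```python
import math

THETA = 0.2   # assumed similarity threshold
ALPHA = 20.0  # assumed envelope length scale, in angstroms


def distance_matrix(coords):
    """Pairwise Euclidean distances between residue coordinates (e.g. C-alpha atoms)."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def dali_score(dist_a, dist_b, alignment):
    """Sum the pair score over all ordered pairs of aligned residue pairs.

    alignment is a list of (i, k) tuples: residue i of protein A is aligned
    with residue k of protein B (assumed one-to-one).
    """
    total = 0.0
    for i, k in alignment:
        for j, l in alignment:
            d_a = dist_a[i][j]
            d_b = dist_b[k][l]
            mean = 0.5 * (d_a + d_b)
            if i == j or mean == 0.0:
                # A residue pair compared with itself contributes the threshold.
                total += THETA
                continue
            deviation = abs(d_a - d_b) / mean          # relative distance difference
            envelope = math.exp(-(mean / ALPHA) ** 2)  # down-weights long distances
            total += (THETA - deviation) * envelope
    return total


# Hypothetical toy chains: three residues on a line, and a stretched variant.
chain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
chain_b = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (20.0, 0.0, 0.0)]
dist_a = distance_matrix(chain_a)
dist_b = distance_matrix(chain_b)

self_score = dali_score(dist_a, dist_a, [(0, 0), (1, 1), (2, 2)])
stretched_score = dali_score(dist_a, dist_b, [(0, 0), (2, 2)])
print(self_score > 0, stretched_score < 0)
```

Because the score sums over pairs of aligned pairs, the number of terms grows quadratically with alignment length, which is consistent with the abstract's remark that the large number of inter-residue distance pairs makes exact optimization difficult.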
Similar resources
MatAlign: Precise Protein Structure Comparison by Matrix Alignment
We propose a detailed protein structure alignment method named "MatAlign". It is a two-step algorithm. Firstly, we represent 3D protein structures as 2D distance matrices, and align these matrices by means of dynamic programming in order to find the initially aligned residue pairs. Secondly, we refine the initial alignment iteratively into the optimal one according to an objective scoring funct...
Algorithm engineering for optimal alignment of protein structure distance matrices
Protein structural alignment is an important problem in computational biology. In this paper, we present first successes on provably optimal pairwise alignment of protein inter-residue distance matrices, using the popular DALI scoring function. We introduce the structural alignment problem formally, which enables us to express a variety of scoring functions used in previous work as special case...
Sensitivity and selectivity in protein structure comparison.
Seven protein structure comparison methods and two sequence comparison programs were evaluated on their ability to detect either protein homologs or domains with the same topology (fold) as defined by the CATH structure database. The structure alignment programs Dali, Structal, Combinatorial Extension (CE), VAST, and Matras were tested along with SGM and PRIDE, which calculate a structural dist...
Protein structure alignment by incremental combinatorial extension (CE) of the optimal path.
A new algorithm is reported which builds an alignment between two protein structures. The algorithm involves a combinatorial extension (CE) of an alignment path defined by aligned fragment pairs (AFPs) rather than the more conventional techniques using dynamic programming and Monte Carlo optimization. AFPs, as the name suggests, are pairs of fragments, one from each protein, which confer struct...
Protein folds and families: sequence and structure alignments
Dali and HSSP are derived databases organizing protein space in the structurally known regions. We use an automatic structure alignment program (Dali) for the classification of all known 3D structures based on all-against-all comparison of 3D structures in the Protein Data Bank. The HSSP database associates 1D sequences with known 3D structures using a position-weighted dynamic programming meth...